51 research outputs found

    Understanding and Measuring Psychological Stress using Social Media

    A body of literature has demonstrated that users' mental health conditions, such as depression and anxiety, can be predicted from their social media language. There is still a gap in the scientific understanding of how psychological stress is expressed on social media. Stress is one of the primary underlying causes and correlates of chronic physical illnesses and mental health conditions. In this paper, we explore the language of psychological stress with a dataset of 601 social media users, who answered the Perceived Stress Scale questionnaire and also consented to share their Facebook and Twitter data. Firstly, we find that stressed users post about exhaustion, losing control, increased self-focus and physical pain as compared to posts about breakfast, family-time, and travel by users who are not stressed. Secondly, we find that Facebook language is more predictive of stress than Twitter language. Thirdly, we demonstrate how the language-based models thus developed can be adapted and scaled to measure county-level trends. Since county-level language is easily available on Twitter using the Streaming API, we explore multiple domain adaptation algorithms to adapt user-level Facebook models to Twitter language. We find that domain-adapted and scaled social media-based measurements of stress outperform sociodemographic variables (age, gender, race, education, and income), against ground-truth survey-based stress measurements, both at the user- and the county-level in the U.S. Twitter language that scores higher in stress is also predictive of poorer health, less access to facilities and lower socioeconomic status in counties. We conclude with a discussion of the implications of using social media as a new tool for monitoring stress levels of both individuals and counties. Comment: Accepted for publication in the proceedings of ICWSM 201

    Tracking Fluctuations in Psychological States Using Social Media Language: A Case Study of Weekly Emotion

    Personality psychologists are increasingly documenting dynamic, within‐person processes. Big data methodologies can augment this endeavour by allowing for the collection of naturalistic and personality‐relevant digital traces from online environments. Whereas big data methods have primarily been used to catalogue static personality dimensions, here we present a case study in how they can be used to track dynamic fluctuations in psychological states. We apply a text‐based, machine learning prediction model to Facebook status updates to compute weekly trajectories of emotional valence and arousal. We train this model on 2895 human‐annotated Facebook statuses and apply the resulting model to 303 575 Facebook statuses posted by 640 US Facebook users who had previously self‐reported their Big Five traits, yielding an average of 28 weekly estimates per user. We examine the correlations between model‐predicted emotion and self‐reported personality, providing a test of the robustness of these links when using weekly aggregated data, rather than momentary data as in prior work. We further present dynamic visualizations of weekly valence and arousal for every user, while making the final data set of 17 937 weeks openly available. We discuss the strengths and drawbacks of this method in the context of personality psychology’s evolution into a dynamic science. © 2020 European Association of Personality Psychology. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/163564/3/per2261-sup-0001-Open_Practices_Disclosure_Form.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/163564/2/per2261.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/163564/1/per2261_am.pd

    What Twitter Profile and Posted Images Reveal About Depression and Anxiety

    Previous work has found strong links between the choice of social media images and users' emotions, demographics and personality traits. In this study, we examine which attributes of profile and posted images are associated with depression and anxiety of Twitter users. We used a sample of 28,749 Facebook users to build a language prediction model of survey-reported depression and anxiety, and validated it on Twitter on a sample of 887 users who had taken anxiety and depression surveys. We then applied it to a different set of 4,132 Twitter users to impute language-based depression and anxiety labels, and extracted interpretable features of posted and profile pictures to uncover the associations with users' depression and anxiety, controlling for demographics. For depression, we find that profile pictures suppress positive emotions rather than display more negative emotions, likely because of social media self-presentation biases. They also tend to show the single face of the user (rather than show her in groups of friends), marking increased focus on the self, emblematic of depression. Posted images are dominated by grayscale and low aesthetic cohesion across a variety of image features. Profile images of anxious users are similarly marked by grayscale and low aesthetic cohesion, but less so than those of depressed users. Finally, we show that image features can be used to predict depression and anxiety, and that multitask learning that includes a joint modeling of demographics improves prediction performance. Overall, we find that the image attributes that mark depression and anxiety offer a rich lens into these conditions largely congruent with the psychological literature, and that images on Twitter allow inferences about the mental health status of users. Comment: ICWSM 201

    Predicting And Characterizing The Health Of Individuals And Communities Through Language Analysis Of Social Media

    A large and growing fraction of the global population uses social media, through which users share their thoughts, feelings, and behaviors, predominantly through text. To quantify the expression of psychological constructs in language, psychology has evolved a set of “closed-vocabulary” methods using pre-determined dictionaries. Advances in natural language processing have made possible the development of “open-vocabulary” methods to analyze text in data-driven ways, and machine learning algorithms have substantially improved prediction performances. The first chapter introduces these methods, comparing traditional methods of text analysis with newer methods from natural language processing in terms of their relative ability to predict and elucidate the language correlates of age, gender and the personality of Facebook users (N = 65,896). The second and third chapters discuss the use of social media to predict depression in individuals (the most prevalent mental illness). The second chapter reviews the literature on detection of depression through social media and concludes that no study to date has demonstrated the efficacy of this approach to screen for clinician-reported depression. In the third chapter, Facebook data were collected and connected to patients’ medical records (N = 683), and prediction models based on Facebook data were able to forecast the occurrence of depression with fair accuracy, about as well as self-report screening surveys. The fourth chapter applies both sets of methods to geotagged Tweets to predict county-level mortality rates from atherosclerotic heart disease (the leading cause of death in the U.S.) across 1,347 counties, capturing 88% of the U.S. population. In this study, a Twitter model outperformed a model combining ten other leading demographic, socioeconomic and health risk factors.
Across both depression and heart disease, associated language profiles identified fine-grained psychological determinants (e.g., loneliness emerged as a risk factor for depression, and optimism showed a protective association with heart disease). In sum, these studies demonstrate that large-scale text analysis is a valuable tool for psychology with implications for public health, as it allows for the unobtrusive and cost-effective monitoring of disease risk and psychological states of individuals and large populations.

    Big data methods, social media, and the psychology of entrepreneurial regions: capturing cross-county personality traits and their impact on entrepreneurship in the USA

    There is increasing interest in the potential of artificial intelligence and Big Data (e.g., generated via social media) to help understand economic outcomes. But can artificial intelligence models based on publicly available Big Data identify geographical differences in entrepreneurial personality or culture? We use a machine learning model based on 1.5 billion tweets by 5.25 million users to estimate the Big Five personality traits and an entrepreneurial personality profile for 1,772 U.S. counties. The Twitter-based personality estimates show substantial relationships to county-level entrepreneurship activity, accounting for 20% (entrepreneurial personality profile) and 32% (Big Five traits) of the variance in local entrepreneurship, even when controlling for other factors that affect entrepreneurship. Whereas more research is clearly needed, our findings have initial implications for research and practice concerned with entrepreneurial regions and eco-systems, and regional economic outcomes interacting with local culture. The results suggest, for example, that social media datasets and artificial intelligence methods have the potential to deliver information on the personality and culture of regions comparable to that from studies based on millions of questionnaire-based personality tests.

    Lifestyle and wellbeing: Exploring behavioral and demographic covariates in a large US sample

    Using data from a nationally representative sample of 46,179 US adults from the Gallup-Healthways Wellbeing Index, we investigate covariates of four subjective mental wellbeing dimensions spanning evaluative (life satisfaction), positive affective (happiness), negative affective (worry), and eudaimonic wellbeing. Negative covariates were generally more strongly correlated with the four dimensions than positive covariates, with depression, poor health, and loneliness being the greatest negative correlates and excellent health and older age being the greatest positive correlates. We reproduce previous evidence for a “midlife crisis” around age 50 across the four wellbeing dimensions. Notably, although salutogenic behaviors (diet, exercise, socializing) correlated with greater wellbeing, there were diminishing benefits beyond thresholds of about four hours a day spent socializing, four days per week of consuming fruits and vegetables, and four days per week of exercising. Findings suggest that wellbeing is more easily lost than gained, underscore the influence that relatively malleable lifestyle factors have on wellbeing, and stress the importance of multidimensional measurement for public policy.

    Linguistic analysis of empathy in medical school admission essays.

    Objectives: This study aimed to determine whether words used in medical school admissions essays can predict physician empathy. Methods: A computational form of linguistic analysis was used for the content analysis of medical school admissions essays. Words in medical school admissions essays were computationally grouped into 20 'topics' which were then correlated with scores on the Jefferson Scale of Empathy. The study sample included 1,805 matriculants (between 2008-2015) at a single medical college in the North East of the United States who wrote an admissions essay and completed the Jefferson Scale of Empathy at matriculation. Results: After correcting for multiple comparisons and controlling for gender, the Jefferson Scale of Empathy scores significantly correlated with a linguistic topic (r = .074, p < .05). This topic comprised specific words used in essays such as understanding, compassion, empathy, feeling, and trust. These words are related to themes emphasized in both theoretical writing and empirical studies on physician empathy. Conclusions: This study demonstrates that physician empathy can be predicted from medical school admission essays. The implications of this methodological capability, i.e. to quantitatively associate linguistic features or words with psychometric outcomes, bear on the future of medical education research and admissions. In particular, these findings suggest that those responsible for medical school admissions could identify more empathetic applicants based on the language of their application essays.

    Measuring the burden of infodemics : summary of the methods and results of the fifth WHO infodemic management conference

    Background: An infodemic is excess information, including false or misleading information, that spreads in digital and physical environments during a public health emergency. The COVID-19 pandemic has been accompanied by an unprecedented global infodemic that has led to confusion about the benefits of medical and public health interventions, with substantial impact on risk-taking and health-seeking behaviors, eroding trust in health authorities and compromising the effectiveness of public health responses and policies. Standardized measures are needed to quantify the harmful impacts of the infodemic in a systematic and methodologically robust manner, as well as harmonizing highly divergent approaches currently explored for this purpose. This can serve as a foundation for a systematic, evidence-based approach to monitoring, identifying, and mitigating future infodemic harms in emergency preparedness and prevention. Objective: In this paper, we summarize the Fifth World Health Organization (WHO) Infodemic Management Conference structure, proceedings, outcomes, and proposed actions seeking to identify the interdisciplinary approaches and frameworks needed to enable the measurement of the burden of infodemics. Methods: An iterative human-centered design (HCD) approach and concept mapping were used to facilitate focused discussions and allow for the generation of actionable outcomes and recommendations. The discussions included 86 participants representing diverse scientific disciplines and health authorities from 28 countries across all WHO regions, along with observers from civil society and global public health–implementing partners. A thematic map capturing the concepts matching the key contributing factors to the public health burden of infodemics was used throughout the conference to frame and contextualize discussions. Five key areas for immediate action were identified. 
Results: The 5 key areas for the development of metrics to assess the burden of infodemics and associated interventions included (1) developing standardized definitions and ensuring the adoption thereof; (2) improving the map of concepts influencing the burden of infodemics; (3) conducting a review of evidence, tools, and data sources; (4) setting up a technical working group; and (5) addressing immediate priorities for postpandemic recovery and resilience building. The summary report consolidated group input toward a common vocabulary with standardized terms, concepts, study designs, measures, and tools to estimate the burden of infodemics and the effectiveness of infodemic management interventions. Conclusions: Standardizing measurement is the basis for documenting the burden of infodemics on health systems and population health during emergencies. Investment is needed into the development of practical, affordable, evidence-based, and systematic methods that are legally and ethically balanced for monitoring infodemics; generating diagnostics, infodemic insights, and recommendations; and developing interventions, action-oriented guidance, policies, support options, mechanisms, and tools for infodemic managers and emergency program managers. Peer reviewed.